Presentation001
__TOC__
=I - Managing 10 Billion Processors=
A Presentation on Cellular Operating Systems

10^{10} or more.

==Purpose of Presentation==
Explain the Design of Cellular Operating Systems

==Point of Presentation==
The way that an operating system manages 1,000 processors is so fundamentally different than the manner in which it manages one or two processors that the entire design of an operating system must be rethought. (Wentzlaff, Agarwal, 2009)

COS are a New Way of Building Scalable Multiprocessor Operating Systems

==Related Wiki Pages==
* Cooperating Operating systems
* Problem Definition
* Background
* Eager Queue Protocols
* Abstract Operating Systems

==Slide: Introduction==
What is a Cooperating Operating System?

This presentation will explain what a COS is and how it is implemented. It will start by explaining the design problem.

==Section 1: The Problems==
===Section Overview===
This is very simplistic at the moment.
* Single Processors can't get any faster as they get too hot!
** Solution: Build Multiprocessors
* Problem: We can't program multiprocessors efficiently
** Solution: Design a Scalable Multiprocessor OS
** Base the Solution on an SDMI Architecture

===SDMI - The Blitz===
SDMI architectures have been constructed before. The most famous example is London during the Blitz.
* Single Data - Air Raid Warning - Triggers:
** People to Search Lights
** People to Guns
** People to Shelters
** People to Streets
** People to Fire Engines
** People to Fighter Planes

===Slide: Too Fast - Too Hot!===

==Section 2: Background==
===Section Overview===
This section is a bit about my background. It covers:
* The research I did from 1985 to 1989 into Adaptable Processor Systems, and the difficulties with that work.
* How later work I did in 2004 on monitoring the state of networked servers got me thinking again about adaptable processor systems.
* My current research into multicore/multiprocessors.
* The state of the process calculi field.
* Attempts to develop a calculus for higher order abstract operating systems.
===Slides===
* Research into Adaptable Processor Systems
* CHOCS
* Network extensible Window System
* Catt Spiral
* Kernel Logic Machine
* Work on the Child Trust Fund
* Study of Minix
* Current state of process calculi
* Modelling of Operating Systems using Z

==Section 3: The Current Multiprocessor Landscape==
===Section Overview===
* What is happening at the moment in the multiprocessor field?
* What are the recurring problems?
* What significant results are there?
** Factored Operating Systems (fos): The Case for a Scalable Operating System for Multicores
** Best of both worlds: A bus enhanced NoC (BENoC)
** Singularity: Rethinking the Software Stack
** On the Potential of NoC Virtualization for Multicore Chips
** Power Management Enhancements in the 45nm Intel Core Microarchitecture
** Using Asymmetric Single-ISA CMPs to Save Energy on Operating Systems

===Slides: Multiprocessors===
* Intel SCC
* Intel Larrabee
* Freescale MSC8156
* Tilera TILEPro64
* The Kernel Logic Machine
* Apple's GCD (Grand Central Dispatch)

==Section 4: Design a Scalable General Purpose Multiprocessor Operating System==
===Section Overview===
Designing a multiprocessor operating system involves many sub-problems beyond managing processors and processes. Can the system be energy efficient? Can it handle processor failures? Can systems be built that are correct? This section looks at these sub-problems.
* Design a Scalable General Purpose Multi-Processor Operating System
* Sub Problems:
** Correctness (Logical Reliability)
** Physical Reliability
** Energy Efficiency
** State Awareness

===Slides===

==Section 5: Eager Queue Protocols==
===Section Overview===
Eager Queue Protocols are a core part of the design being presented. They are used to separate system management from system functions. This section starts with the design of the initial eager queue protocol and progresses to the options available for the use of an EQP in a Cooperating System.

Points to Cover:
* Employment Agency analogy.
* The queue is a replicated shared resource.
* The messages of the basic protocol:
** JOIN
** EXIT
** UPDATE
** REQUEST
** ACCEPT
** DECLINE
* Performance of the protocol
** How performance can be varied depending on the hostility of the environment.

==Section 6: Abstract Operating Systems==
===Section Overview===
If we look at many simple Operating System kernels, there are essentially three functions:
# Process Management
# Process Communication
# Scheduler
The scheduler is normally present because there are more processes than processors. If we push things to the limit and assume we have an unlimited number of processors, where does this leave us? Space multiplexing replaces time multiplexing (scheduling) and we are left with two functions:
# Process Management
# Process Communication
Process communication can occur independently of the operating system. This leaves us with the observation that an operating system's primary function is process management.

===Slides: The Operating System Landscape===
* The Nucleus of a Multiprogramming System (Brinch Hansen, 1970)
* Minix
* Linux
* Craig's Simple Kernel
* GNU Hurd
** A Critique of the GNU Hurd Multi-server Operating System
* Systolic Arrays
* RMoX

===Slides: Abstract Operating System 1===
* Unlimited Number of Processors
* Interprocessor Communication

==Section 7: Cooperating Operating Systems==
===Section Overview===
Cooperating Operating Systems bring the ideas of Eager Queue Protocols and Abstract Operating Systems together.
* A two dimensional chip layout is proposed.
* Energy efficiency is introduced using Q-Cells as controllers.
* Fault tolerance is added using ideas from the Kernel Logic Machine.
* Feedback loops are added by extending the PQ-Cell API.
* A COS is a MISD architecture (MISD: No examples today - Review: (Patterson, Hennessy, 2008)).
* Hormones and Homeostasis as an analogy for a COS.

==Section 8: Physical Implementation==
===Section Overview===
How practical is a COS?
Things to consider include:
* Broadcasting on Chips / Across Chips
* Broadcast message rate in relation to process life time
* Size of processor network
* Message routing scheme in relation to Q-Cell "Knowledge"
* Adding a virtualization layer.
* Use of topology-agnostic routing algorithms.
* Snoopy protocol with broadcast packets.
* Can Q-Cells help with cache coherency?

==Section 9: Further Work==
===Section Overview===
Summarise what has been achieved so far. Look at how to continue the work.
* Extend EQP so that it can work across dynamic sub-networks
* Look at EQP as a "Piggy Back" protocol.
* Look at the implications of implementing a COS in silicon
* Extend the simulations and write a better simulator
* Look at a real implementation of a COS

==Last Slide==
===Key Points===
# COS is aimed at general purpose computing.
# COS can configure specialist processor networks, e.g. systolic arrays.
# Separates the Operating System from the Applications
# Decentralised Process/Processor Management
# Uses Eager Queue Protocols to Manage Distributed System State
# Eager Queue Protocols are broadcast based and minimise collisions
# Eager Queues use deltas to update Q-Cells
# Aims to run with Just Enough Power
# Fast Boot as each process boots in parallel
# Processes run on isolated Processors (cores)
# Simple Cell Architecture - PQ-Cells
# Fault Tolerant using ideas of Ivor Catt

Category:Research
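The key points that Eager Queue Protocols are broadcast based and update Q-Cells with deltas can be illustrated with a minimal sketch. This page names the six basic messages (JOIN, EXIT, UPDATE, REQUEST, ACCEPT, DECLINE) but does not define their semantics, so everything below is an assumption made for illustration: the names `Msg`, `QCell` and `broadcast`, the status strings, and the effect of each message on the queue are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Msg(Enum):
    """The six message names of the basic protocol (names from Section 5)."""
    JOIN = auto()
    EXIT = auto()
    UPDATE = auto()
    REQUEST = auto()
    ACCEPT = auto()
    DECLINE = auto()


@dataclass
class QCell:
    """One replica of the shared queue (hypothetical layout: id -> status)."""
    queue: dict = field(default_factory=dict)

    def apply(self, msg, cell_id, status=None):
        # Assumption: each message is a delta applied to the local replica.
        if msg is Msg.JOIN:
            self.queue[cell_id] = "idle"       # cell becomes available
        elif msg is Msg.EXIT:
            self.queue.pop(cell_id, None)      # cell leaves the queue
        elif msg is Msg.UPDATE and cell_id in self.queue:
            self.queue[cell_id] = status       # cell reports a new status
        elif msg is Msg.ACCEPT and cell_id in self.queue:
            self.queue[cell_id] = "busy"       # cell took on a request
        # Assumption: REQUEST and DECLINE do not change the queue here.


def broadcast(cells, msg, cell_id, status=None):
    """Deliver the same delta to every replica, standing in for a broadcast."""
    for cell in cells:
        cell.apply(msg, cell_id, status)


cells = [QCell() for _ in range(3)]
broadcast(cells, Msg.JOIN, "p1")
broadcast(cells, Msg.JOIN, "p2")
broadcast(cells, Msg.ACCEPT, "p1")
broadcast(cells, Msg.EXIT, "p2")
```

Because every replica receives the same deltas in the same order, all three Q-Cells in this sketch end up holding the same queue. Lost or reordered broadcasts, and collisions, are deliberately left out.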